1.
Nat Commun ; 14(1): 2914, 2023 05 22.
Article in English | MEDLINE | ID: covidwho-2322120

ABSTRACT

Long COVID, or complications arising from COVID-19 weeks after infection, has become a central concern for public health experts. The United States National Institutes of Health founded the RECOVER initiative to better understand long COVID. We used electronic health records available through the National COVID Cohort Collaborative to characterize the association between SARS-CoV-2 vaccination and long COVID diagnosis. Among patients with a COVID-19 infection between August 1, 2021 and January 31, 2022, we defined two cohorts using distinct definitions of long COVID, a clinical diagnosis (n = 47,404) or a previously described computational phenotype (n = 198,514), to compare unvaccinated individuals to those with a complete vaccine series prior to infection. Evidence of long COVID was monitored through June or July of 2022, depending on patients' data availability. We found that vaccination was consistently associated with lower odds and rates of long COVID clinical diagnosis and high-confidence computationally derived diagnosis after adjusting for sex, demographics, and medical history.


Subject(s)
COVID-19 , Post-Acute COVID-19 Syndrome , United States/epidemiology , Humans , COVID-19/epidemiology , COVID-19/prevention & control , COVID-19 Vaccines , Cohort Studies , SARS-CoV-2 , Vaccination
2.
EBioMedicine ; 87: 104413, 2023 Jan.
Article in English | MEDLINE | ID: covidwho-2165228

ABSTRACT

BACKGROUND: Stratification of patients with post-acute sequelae of SARS-CoV-2 infection (PASC, or long COVID) would allow precision clinical management strategies. However, long COVID is incompletely understood and characterised by a wide range of manifestations that are difficult to analyse computationally. Additionally, the generalisability of machine learning classification of COVID-19 clinical outcomes has rarely been tested. METHODS: We present a method for computationally modelling PASC phenotype data based on electronic healthcare records (EHRs) and for assessing pairwise phenotypic similarity between patients using semantic similarity. Our approach defines a nonlinear similarity function that maps from a feature space of phenotypic abnormalities to a matrix of pairwise patient similarity that can be clustered using unsupervised machine learning. FINDINGS: We found six clusters of PASC patients, each with distinct profiles of phenotypic abnormalities, including clusters with distinct pulmonary, neuropsychiatric, and cardiovascular abnormalities, and a cluster associated with broad, severe manifestations and increased mortality. There was significant association of cluster membership with a range of pre-existing conditions and measures of severity during acute COVID-19. We assigned new patients from other healthcare centres to clusters by maximum semantic similarity to the original patients, and showed that the clusters were generalisable across different hospital systems. The increased mortality rate originally identified in one cluster was consistently observed in patients assigned to that cluster in other hospital systems. INTERPRETATION: Semantic phenotypic clustering provides a foundation for assigning patients to stratified subgroups for natural history or therapy studies on PASC. FUNDING: NIH (TR002306/OT2HL161847-01/OD011883/HG010860), U.S.D.O.E. (DE-AC02-05CH11231), Donald A. Roux Family Fund at Jackson Laboratory, Marsico Family at CU Anschutz.


Subject(s)
COVID-19 , Post-Acute COVID-19 Syndrome , Humans , Disease Progression , SARS-CoV-2
3.
Lancet Digit Health ; 4(7): e532-e541, 2022 07.
Article in English | MEDLINE | ID: covidwho-1852294

ABSTRACT

BACKGROUND: Post-acute sequelae of SARS-CoV-2 infection, known as long COVID, have severely affected recovery from the COVID-19 pandemic for patients and society alike. Long COVID is characterised by evolving, heterogeneous symptoms, making it challenging to derive an unambiguous definition. Studies of electronic health records are a crucial element of the US National Institutes of Health's RECOVER Initiative, which is addressing the urgent need to understand long COVID, identify treatments, and accurately identify who has it; the latter is the aim of this study. METHODS: Using the National COVID Cohort Collaborative's (N3C) electronic health record repository, we developed XGBoost machine learning models to identify potential patients with long COVID. We defined our base population (n=1 793 604) as any non-deceased adult patient (age ≥18 years) with either an International Classification of Diseases-10-Clinical Modification COVID-19 diagnosis code (U07.1) from an inpatient or emergency visit, or a positive SARS-CoV-2 PCR or antigen test, and for whom at least 90 days have passed since COVID-19 index date. We examined demographics, health-care utilisation, diagnoses, and medications for 97 995 adults with COVID-19. We used data on these features and 597 patients from a long COVID clinic to train three machine learning models to identify potential long COVID among all patients with COVID-19, patients hospitalised with COVID-19, and patients who had COVID-19 but were not hospitalised. Feature importance was determined via Shapley values. We further validated the models on data from a fourth site. FINDINGS: Our models identified, with high accuracy, patients who potentially have long COVID, achieving areas under the receiver operator characteristic curve of 0·92 (all patients), 0·90 (hospitalised), and 0·85 (non-hospitalised).
Important features, as defined by Shapley values, include rate of health-care utilisation, patient age, dyspnoea, and other diagnosis and medication information available within the electronic health record. INTERPRETATION: Patients identified by our models as potentially having long COVID can be interpreted as patients warranting care at a specialty clinic for long COVID, which is an essential proxy for long COVID diagnosis as its definition continues to evolve. We also achieve the urgent goal of identifying potential long COVID in patients for clinical trials. As more data sources are identified, our models can be retrained and tuned based on the needs of individual studies. FUNDING: US National Institutes of Health and National Center for Advancing Translational Sciences through the RECOVER Initiative.
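The area under the receiver operator characteristic curve reported above can be read as the probability that a randomly chosen positive case receives a higher model score than a randomly chosen negative case. The following is a minimal Python sketch of that pairwise definition; the function name `auroc` and the toy scores are illustrative assumptions and are not taken from the study, which does not describe its evaluation code.

```python
def auroc(scores, labels):
    """Area under the ROC curve via pairwise comparison (Mann-Whitney form).

    scores: model outputs, higher meaning "more likely positive".
    labels: 1 for positive cases, 0 for negative cases.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 3 of the 4 positive/negative pairs are ordered correctly.
toy_auc = auroc([0.9, 0.4, 0.6, 0.1], [1, 1, 0, 0])
```

The quadratic loop is chosen for clarity; a rank-based implementation would be preferable for cohorts of the size described in the abstract.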


Subject(s)
COVID-19 , Adolescent , Adult , COVID-19/complications , COVID-19/diagnosis , COVID-19/epidemiology , COVID-19 Testing , Humans , Machine Learning , Pandemics , SARS-CoV-2 , United States/epidemiology , Post-Acute COVID-19 Syndrome
4.
Virol J ; 19(1): 84, 2022 05 15.
Article in English | MEDLINE | ID: covidwho-1846850

ABSTRACT

BACKGROUND: Non-steroidal anti-inflammatory drugs (NSAIDs) are commonly used to reduce pain, fever, and inflammation but have been associated with complications in community-acquired pneumonia. Observations shortly after the start of the COVID-19 pandemic in 2020 suggested that ibuprofen was associated with an increased risk of adverse events in COVID-19 patients, but subsequent observational studies failed to demonstrate increased risk and in one case showed reduced risk associated with NSAID use. METHODS: A 38-center retrospective cohort study was performed that leveraged the harmonized, high-granularity electronic health record data of the National COVID Cohort Collaborative. A propensity-matched cohort of 19,746 COVID-19 inpatients was constructed by matching cases (treated with NSAIDs at the time of admission) and 19,746 controls (not treated) from 857,061 patients with COVID-19 available for analysis. The primary outcome of interest was COVID-19 severity in hospitalized patients, which was classified as: moderate, severe, or mortality/hospice. Secondary outcomes were acute kidney injury (AKI), extracorporeal membrane oxygenation (ECMO), invasive ventilation, and all-cause mortality at any time following COVID-19 diagnosis. RESULTS: Logistic regression showed that NSAID use was not associated with increased COVID-19 severity (OR: 0.57 95% CI: 0.53-0.61). Analysis of secondary outcomes using logistic regression showed that NSAID use was not associated with increased risk of all-cause mortality (OR 0.51 95% CI: 0.47-0.56), invasive ventilation (OR: 0.59 95% CI: 0.55-0.64), AKI (OR: 0.67 95% CI: 0.63-0.72), or ECMO (OR: 0.51 95% CI: 0.36-0.7). In contrast, the odds ratios indicate reduced risk of these outcomes, but our quantitative bias analysis showed E-values of between 1.9 and 3.3 for these associations, indicating that comparatively weak or moderate confounder associations could explain away the observed associations. 
CONCLUSIONS: Study interpretation is limited by the observational design. Recording of NSAID use may have been incomplete. Our study demonstrates that NSAID use is not associated with increased COVID-19 severity, all-cause mortality, invasive ventilation, AKI, or ECMO in COVID-19 inpatients. A conservative interpretation in light of the quantitative bias analysis is that there is no evidence that NSAID use is associated with risk of increased severity or the other measured outcomes. Our results confirm and extend analogous findings in previous observational studies using a large cohort of patients drawn from 38 centers in a nationally representative multicenter database.
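The abstract does not say how its E-values were computed. As a hedged illustration, the sketch below uses the standard closed-form E-value for a ratio estimate, E = RR + sqrt(RR × (RR − 1)), taking the reciprocal first when the estimate is below 1. Treating an odds ratio as an approximate risk ratio is an assumption that is only reasonable for rare outcomes; the function name `e_value` is hypothetical.

```python
import math


def e_value(ratio):
    """E-value for a point estimate on the ratio scale.

    Assumption: the odds ratio is used as a stand-in for the risk ratio,
    which is only approximately valid when the outcome is rare.
    """
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    # Protective estimates (< 1) are inverted before applying the formula.
    rr = ratio if ratio >= 1 else 1.0 / ratio
    return rr + math.sqrt(rr * (rr - 1.0))


# OR 0.51 is the all-cause mortality estimate quoted in the abstract.
ev_mortality = e_value(0.51)
```

Under these assumptions the mortality estimate gives an E-value of about 3.3, which sits at the upper end of the 1.9 to 3.3 range the authors report.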


Subject(s)
Acute Kidney Injury , COVID-19 , Anti-Inflammatory Agents, Non-Steroidal/adverse effects , COVID-19 Testing , Cohort Studies , Humans , Pandemics , Retrospective Studies
6.
JAMA Pediatr ; 176(8): 819-821, 2022 08 01.
Article in English | MEDLINE | ID: covidwho-1798055

ABSTRACT

This cohort study uses data from the US National COVID Cohort Collaborative to evaluate upper airway infections in children during the surge of the Omicron (B.1.1.529) variant of SARS-CoV-2 in the US.


Subject(s)
COVID-19 , SARS-CoV-2 , Acute Disease , Child , Cohort Studies , Humans , SARS-CoV-2/genetics
7.
J Am Med Inform Assoc ; 29(7): 1172-1182, 2022 06 14.
Article in English | MEDLINE | ID: covidwho-1795238

ABSTRACT

OBJECTIVE: The goals of this study were to harmonize data from electronic health records (EHRs) into common units, and impute units that were missing. MATERIALS AND METHODS: The National COVID Cohort Collaborative (N3C) table of laboratory measurement data comprises over 3.1 billion patient records and over 19 000 unique measurement concepts in the Observational Medical Outcomes Partnership (OMOP) common-data-model format from 55 data partners. We grouped ontologically similar OMOP concepts together for 52 variables relevant to COVID-19 research, and developed a unit-harmonization pipeline comprising (1) selecting a canonical unit for each measurement variable, (2) arriving at a formula for conversion, (3) obtaining clinical review of each formula, (4) applying the formula to convert data values in each unit into the target canonical unit, and (5) removing any harmonized value that fell outside of accepted value ranges for the variable. For data with missing units for all the results within a lab test for a data partner, we compared values with pooled values of all data partners, using the Kolmogorov-Smirnov test. RESULTS: Of the concepts without missing values, we harmonized 88.1% of the values, and imputed units for 78.2% of records where units were absent (41% of contributors' records lacked units). DISCUSSION: The harmonization and inference methods developed herein can serve as a resource for initiatives aiming to extract insight from heterogeneous EHR collections. Unique properties of centralized data are harnessed to enable unit inference. CONCLUSION: The pipeline we developed for the pooled N3C data enables use of measurements that would otherwise be unavailable for analysis.
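The harmonization steps listed in this abstract can be sketched for a single variable as follows. This is a minimal illustration, not the N3C pipeline: body temperature is chosen only because its conversion formulas are unambiguous, and the unit labels, the accepted range, and the function name `harmonize` are all hypothetical. Step (3), clinical review of each formula, is a human process and is not represented in code.

```python
# Step 1: canonical unit for the variable (hypothetical choice).
CANONICAL_UNIT = "degC"

# Step 2: one conversion formula per source unit into the canonical unit.
TO_CANONICAL = {
    "degC": lambda v: v,
    "degF": lambda v: (v - 32.0) * 5.0 / 9.0,
    "K": lambda v: v - 273.15,
}

# Step 5: accepted value range in the canonical unit (illustrative bounds only).
ACCEPTED_RANGE = (25.0, 45.0)


def harmonize(value, unit):
    """Convert one measurement to the canonical unit.

    Returns None when the unit has no known conversion or when the
    converted value falls outside the accepted range.
    """
    convert = TO_CANONICAL.get(unit)
    if convert is None:
        return None                      # unknown unit: cannot harmonize
    converted = convert(value)           # Step 4: apply the formula
    low, high = ACCEPTED_RANGE
    if not (low <= converted <= high):
        return None                      # Step 5: drop implausible values
    return converted


records = [(98.6, "degF"), (310.15, "K"), (200.0, "degC"), (37.0, "unknown")]
harmonized = [harmonize(v, u) for v, u in records]
```

Returning None rather than raising keeps the function usable over a whole table, so unconvertible or out-of-range rows can be counted and reported instead of stopping the run.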


Subject(s)
COVID-19 , Electronic Health Records , Cohort Studies , Data Collection , Humans
8.
Foot & ankle orthopaedics ; 7(1), 2022.
Article in English | EuropePMC | ID: covidwho-1710910

ABSTRACT

Background: The National COVID Cohort Collaborative (N3C) is an innovative approach to integrate real-world clinical observations into a harmonized database during the time of the COVID-19 pandemic when clinical research on ankle fracture surgery is otherwise mostly limited to expert opinion and research letters. The purpose of this manuscript is to introduce the largest cohort of US ankle fracture surgery patients to date with a comparison between lab-confirmed COVID-19–positive and COVID-19–negative. Methods: A retrospective cohort of adults with ankle fracture surgery using data from the N3C database with patients undergoing surgery between March 2020 and June 2021. The database is an NIH-funded platform through which the harmonized clinical data from 46 sites is stored. Patient characteristics included body mass index, Charlson Comorbidity Index, and smoking status. Outcomes included 30-day mortality, overall mortality, surgical site infection (SSI), deep SSI, acute kidney injury, pulmonary embolism, deep vein thrombosis, sepsis, time to surgery, and length of stay. COVID-19–positive patients were compared to COVID-19–negative controls to investigate perioperative outcomes during the pandemic. Results: A total population of 8.4 million patient records was queried, identifying 4735 adults with ankle fracture surgery. The COVID-19–positive group (n=158, 3.3%) had significantly longer times to surgery (6.5 ± 6.6 vs 5.1 ± 5.5 days, P = .001) and longer lengths of stay (8.3 ± 23.5 vs 4.3 ± 7.4 days, P < .001), compared to the COVID-19–negative group. The COVID-19–positive group also had a higher rate of 30-day mortality. Conclusion: Patients with ankle fracture surgery had longer time to surgery and prolonged hospitalizations in COVID-19–positive patients compared to those who tested negative (average delay was about 1 day and increased length of hospitalization was about 4 days). Few perioperative events were observed in either group. 
Overall, the risks associated with COVID-19 were measurable but not substantial. Level of Evidence: Level III, retrospective cohort study.

9.
JAMA Netw Open ; 5(2): e2143151, 2022 02 01.
Article in English | MEDLINE | ID: covidwho-1669321

ABSTRACT

Importance: Understanding of SARS-CoV-2 infection in US children has been limited by the lack of large, multicenter studies with granular data. Objective: To examine the characteristics, changes over time, outcomes, and severity risk factors of children with SARS-CoV-2 within the National COVID Cohort Collaborative (N3C). Design, Setting, and Participants: A prospective cohort study of encounters with end dates before September 24, 2021, was conducted at 56 N3C facilities throughout the US. Participants included children younger than 19 years at initial SARS-CoV-2 testing. Main Outcomes and Measures: Case incidence and severity over time, demographic and comorbidity severity risk factors, vital sign and laboratory trajectories, clinical outcomes, and acute COVID-19 vs multisystem inflammatory syndrome in children (MIS-C), and Delta vs pre-Delta variant differences for children with SARS-CoV-2. Results: A total of 1 068 410 children were tested for SARS-CoV-2 and 167 262 test results (15.6%) were positive (82 882 [49.6%] girls; median age, 11.9 [IQR, 6.0-16.1] years). Among the 10 245 children (6.1%) who were hospitalized, 1423 (13.9%) met the criteria for severe disease: mechanical ventilation (796 [7.8%]), vasopressor-inotropic support (868 [8.5%]), extracorporeal membrane oxygenation (42 [0.4%]), or death (131 [1.3%]). Male sex (odds ratio [OR], 1.37; 95% CI, 1.21-1.56), Black/African American race (OR, 1.25; 95% CI, 1.06-1.47), obesity (OR, 1.19; 95% CI, 1.01-1.41), and several pediatric complex chronic condition (PCCC) subcategories were associated with higher severity disease. Vital signs and many laboratory test values from the day of admission were predictive of peak disease severity. 
Variables associated with increased odds for MIS-C vs acute COVID-19 included male sex (OR, 1.59; 95% CI, 1.33-1.90), Black/African American race (OR, 1.44; 95% CI, 1.17-1.77), younger than 12 years (OR, 1.81; 95% CI, 1.51-2.18), obesity (OR, 1.76; 95% CI, 1.40-2.22), and not having a pediatric complex chronic condition (OR, 0.72; 95% CI, 0.65-0.80). The children with MIS-C had a more inflammatory laboratory profile and severe clinical phenotype, with higher rates of invasive ventilation (117 of 707 [16.5%] vs 514 of 8241 [6.2%]; P < .001) and need for vasoactive-inotropic support (191 of 707 [27.0%] vs 426 of 8241 [5.2%]; P < .001) compared with those who had acute COVID-19. Comparing children during the Delta vs pre-Delta eras, there was no significant change in hospitalization rate (1738 [6.0%] vs 8507 [6.2%]; P = .18) and lower odds for severe disease (179 [10.3%] vs 1242 [14.6%]) (decreased by a factor of 0.67; 95% CI, 0.57-0.79; P < .001). Conclusions and Relevance: In this cohort study of US children with SARS-CoV-2, there were observed differences in demographic characteristics, preexisting comorbidities, and initial vital sign and laboratory values between severity subgroups. Taken together, these results suggest that early identification of children likely to progress to severe disease could be achieved using readily available data elements from the day of admission. Further work is needed to translate this knowledge into improved outcomes.
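As a worked check on the Delta versus pre-Delta comparison, an unadjusted odds ratio can be computed directly from the hospitalization and severe-disease counts quoted in the abstract. The abstract does not state whether its factor of 0.67 is adjusted, so the sketch below is only a crude calculation; the function name `odds_ratio` is hypothetical.

```python
def odds_ratio(events_a, total_a, events_b, total_b):
    """Unadjusted odds ratio of group A relative to group B."""
    non_events_a = total_a - events_a
    non_events_b = total_b - events_b
    if non_events_a <= 0 or non_events_b <= 0 or events_b <= 0:
        raise ValueError("odds are undefined for these counts")
    odds_a = events_a / non_events_a
    odds_b = events_b / non_events_b
    return odds_a / odds_b


# Counts from the abstract: severe disease among hospitalized children,
# Delta era (179 of 1738) versus pre-Delta era (1242 of 8507).
crude_or = odds_ratio(179, 1738, 1242, 8507)
```

With these counts the crude odds ratio comes out close to the reported 0.67, which suggests, but does not confirm, that the published figure is an unadjusted estimate.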


Subject(s)
COVID-19/epidemiology , Adolescent , Age Distribution , COVID-19/complications , COVID-19/diagnosis , COVID-19/therapy , COVID-19/virology , Child , Child, Preschool , Comorbidity , Disease Progression , Early Diagnosis , Female , Humans , Infant , Male , Risk Factors , SARS-CoV-2 , Severity of Illness Index , Sociodemographic Factors , Systemic Inflammatory Response Syndrome/diagnosis , Systemic Inflammatory Response Syndrome/epidemiology , Systemic Inflammatory Response Syndrome/therapy , Systemic Inflammatory Response Syndrome/virology , United States/epidemiology , Vital Signs
10.
J Am Acad Orthop Surg Glob Res Rev ; 6(1), 2022 01 04.
Article in English | MEDLINE | ID: covidwho-1606097

ABSTRACT

BACKGROUND: This study investigated the outcomes of coronavirus disease (COVID-19)-positive patients undergoing hip fracture surgery using a national database. METHODS: This is a retrospective cohort study comparing hip fracture surgery outcomes between COVID-19 positive and negative matched cohorts from 46 sites in the United States. Patients aged 65 and older with hip fracture surgery between March 15 and December 31, 2020, were included. The main outcomes were 30-day all-cause mortality and all-cause mortality. RESULTS: In this national study that included 3303 adults with hip fracture surgery, the 30-day mortality was 14.6% with COVID-19-positive versus 3.8% in COVID-19-negative, a notable difference. The all-cause mortality for hip fracture surgery was 27.0% in the COVID-19-positive group during the study period. DISCUSSION: We found higher incidence of all-cause mortality in patients with versus without diagnosis of COVID-19 after undergoing hip fracture surgery. The mortality in hip fracture surgery in this national analysis was lower than other local and regional reports. The medical community can use this information to guide the management of hip fracture patients with a diagnosis of COVID-19.


Subject(s)
COVID-19 , Hip Fractures , Adult , Cohort Studies , Hip Fractures/surgery , Humans , Retrospective Studies , SARS-CoV-2 , United States/epidemiology
12.
J Am Med Inform Assoc ; 29(4): 609-618, 2022 03 15.
Article in English | MEDLINE | ID: covidwho-1443051

ABSTRACT

OBJECTIVE: In response to COVID-19, the informatics community united to aggregate as much clinical data as possible to characterize this new disease and reduce its impact through collaborative analytics. The National COVID Cohort Collaborative (N3C) is now the largest publicly available HIPAA limited dataset in US history with over 6.4 million patients and is a testament to a partnership of over 100 organizations. MATERIALS AND METHODS: We developed a pipeline for ingesting, harmonizing, and centralizing data from 56 contributing data partners using 4 federated Common Data Models. N3C data quality (DQ) review involves both automated and manual procedures. In the process, several DQ heuristics were discovered in our centralized context, both within the pipeline and during downstream project-based analysis. Feedback to the sites led to many local and centralized DQ improvements. RESULTS: Beyond well-recognized DQ findings, we discovered 15 heuristics relating to source Common Data Model conformance, demographics, COVID tests, conditions, encounters, measurements, observations, coding completeness, and fitness for use. Of 56 sites, 37 sites (66%) demonstrated issues through these heuristics. These 37 sites demonstrated improvement after receiving feedback. DISCUSSION: We encountered site-to-site differences in DQ which would have been challenging to discover using federated checks alone. We have demonstrated that centralized DQ benchmarking reveals unique opportunities for DQ improvement that will support improved research analytics locally and in aggregate. CONCLUSION: By combining rapid, continual assessment of DQ with a large volume of multisite data, it is possible to support more nuanced scientific questions with the scale and rigor that they require.


Subject(s)
COVID-19 , Cohort Studies , Data Accuracy , Health Insurance Portability and Accountability Act , Humans , United States
13.
Diabetes Care ; 44(7): 1564-1572, 2021 07.
Article in English | MEDLINE | ID: covidwho-1405389

ABSTRACT

OBJECTIVE: To determine the respective associations of premorbid glucagon-like peptide-1 receptor agonist (GLP1-RA) and sodium-glucose cotransporter 2 inhibitor (SGLT2i) use, compared with premorbid dipeptidyl peptidase 4 inhibitor (DPP4i) use, with severity of outcomes in the setting of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) infection. RESEARCH DESIGN AND METHODS: We analyzed observational data from SARS-CoV-2-positive adults in the National COVID Cohort Collaborative (N3C), a multicenter, longitudinal U.S. cohort (January 2018-February 2021), with a prescription for GLP1-RA, SGLT2i, or DPP4i within 24 months of positive SARS-CoV-2 PCR test. The primary outcome was 60-day mortality, measured from positive SARS-CoV-2 test date. Secondary outcomes were total mortality during the observation period and emergency room visits, hospitalization, and mechanical ventilation within 14 days. Associations were quantified with odds ratios (ORs) estimated with targeted maximum likelihood estimation using a super learner approach, accounting for baseline characteristics. RESULTS: The study included 12,446 individuals (53.4% female, 62.5% White, mean ± SD age 58.6 ± 13.1 years). The 60-day mortality was 3.11% (387 of 12,446), with 2.06% (138 of 6,692) for GLP1-RA use, 2.32% (85 of 3,665) for SGLT2i use, and 5.67% (199 of 3,511) for DPP4i use. Both GLP1-RA and SGLT2i use were associated with lower 60-day mortality compared with DPP4i use (OR 0.54 [95% CI 0.37-0.80] and 0.66 [0.50-0.86], respectively). Use of both medications was also associated with decreased total mortality, emergency room visits, and hospitalizations. CONCLUSIONS: Among SARS-CoV-2-positive adults, premorbid GLP1-RA and SGLT2i use, compared with DPP4i use, was associated with lower odds of mortality and other adverse outcomes, although DPP4i users were older and generally sicker.


Subject(s)
COVID-19 , Diabetes Mellitus, Type 2 , Glucagon-Like Peptide-1 Receptor/agonists , Sodium-Glucose Transporter 2 Inhibitors , Adult , Aged , COVID-19/diagnosis , Diabetes Mellitus, Type 2/drug therapy , Female , Humans , Longitudinal Studies , Male , Middle Aged , Sodium-Glucose Transporter 2 Inhibitors/therapeutic use , United States
14.
JAMA Netw Open ; 4(7): e2116901, 2021 07 01.
Article in English | MEDLINE | ID: covidwho-1306627

ABSTRACT

Importance: The National COVID Cohort Collaborative (N3C) is a centralized, harmonized, high-granularity electronic health record repository that is the largest, most representative COVID-19 cohort to date. This multicenter data set can support robust evidence-based development of predictive and diagnostic tools and inform clinical care and policy. Objectives: To evaluate COVID-19 severity and risk factors over time and assess the use of machine learning to predict clinical severity. Design, Setting, and Participants: In a retrospective cohort study of 1 926 526 US adults with SARS-CoV-2 infection (polymerase chain reaction >99% or antigen <1%) and adult patients without SARS-CoV-2 infection who served as controls from 34 medical centers nationwide between January 1, 2020, and December 7, 2020, patients were stratified using a World Health Organization COVID-19 severity scale and demographic characteristics. Differences between groups over time were evaluated using multivariable logistic regression. Random forest and XGBoost models were used to predict severe clinical course (death, discharge to hospice, invasive ventilatory support, or extracorporeal membrane oxygenation). Main Outcomes and Measures: Patient demographic characteristics and COVID-19 severity using the World Health Organization COVID-19 severity scale and differences between groups over time using multivariable logistic regression. Results: The cohort included 174 568 adults who tested positive for SARS-CoV-2 (mean [SD] age, 44.4 [18.6] years; 53.2% female) and 1 133 848 adult controls who tested negative for SARS-CoV-2 (mean [SD] age, 49.5 [19.2] years; 57.1% female). Of the 174 568 adults with SARS-CoV-2, 32 472 (18.6%) were hospitalized, and 6565 (20.2%) of those had a severe clinical course (invasive ventilatory support, extracorporeal membrane oxygenation, death, or discharge to hospice). 
Of the hospitalized patients, mortality was 11.6% overall and decreased from 16.4% in March to April 2020 to 8.6% in September to October 2020 (P = .002 for monthly trend). Using 64 inputs available on the first hospital day, this study predicted a severe clinical course using random forest and XGBoost models (area under the receiver operating curve = 0.87 for both) that were stable over time. The factor most strongly associated with clinical severity was pH; this result was consistent across machine learning methods. In a separate multivariable logistic regression model built for inference, age (odds ratio [OR], 1.03 per year; 95% CI, 1.03-1.04), male sex (OR, 1.60; 95% CI, 1.51-1.69), liver disease (OR, 1.20; 95% CI, 1.08-1.34), dementia (OR, 1.26; 95% CI, 1.13-1.41), African American (OR, 1.12; 95% CI, 1.05-1.20) and Asian (OR, 1.33; 95% CI, 1.12-1.57) race, and obesity (OR, 1.36; 95% CI, 1.27-1.46) were independently associated with higher clinical severity. Conclusions and Relevance: This cohort study found that COVID-19 mortality decreased over time during 2020 and that patient demographic characteristics and comorbidities were associated with higher clinical severity. The machine learning models accurately predicted ultimate clinical severity using commonly collected clinical data from the first 24 hours of a hospital admission.


Subject(s)
COVID-19 , Databases, Factual , Forecasting , Hospitalization , Models, Biological , Severity of Illness Index , Adult , Aged , Aged, 80 and over , COVID-19/ethnology , COVID-19/mortality , Comorbidity , Ethnicity , Extracorporeal Membrane Oxygenation , Female , Humans , Hydrogen-Ion Concentration , Male , Middle Aged , Pandemics , Respiration, Artificial , Retrospective Studies , Risk Factors , SARS-CoV-2 , United States , Young Adult
15.
J Am Med Inform Assoc ; 28(3): 427-443, 2021 03 01.
Article in English | MEDLINE | ID: covidwho-719257

ABSTRACT

OBJECTIVE: Coronavirus disease 2019 (COVID-19) poses societal challenges that require expeditious data and knowledge sharing. Though organizational clinical data are abundant, these are largely inaccessible to outside researchers. Statistical, machine learning, and causal analyses are most successful with large-scale data beyond what is available in any given organization. Here, we introduce the National COVID Cohort Collaborative (N3C), an open science community focused on analyzing patient-level data from many centers. MATERIALS AND METHODS: The Clinical and Translational Science Award Program and scientific community created N3C to overcome technical, regulatory, policy, and governance barriers to sharing and harmonizing individual-level clinical data. We developed solutions to extract, aggregate, and harmonize data across organizations and data models, and created a secure data enclave to enable efficient, transparent, and reproducible collaborative analytics. RESULTS: Organized in inclusive workstreams, we created legal agreements and governance for organizations and researchers; data extraction scripts to identify and ingest positive, negative, and possible COVID-19 cases; a data quality assurance and harmonization pipeline to create a single harmonized dataset; population of the secure data enclave with data, machine learning, and statistical analytics tools; dissemination mechanisms; and a synthetic data pilot to democratize data access. CONCLUSIONS: The N3C has demonstrated that a multisite collaborative learning health network can overcome barriers to rapidly build a scalable infrastructure incorporating multiorganizational clinical data for COVID-19 analytics. We expect this effort to save lives by enabling rapid collaboration among clinicians, researchers, and data scientists to identify treatments and specialized care and thereby reduce the immediate and long-term impacts of COVID-19.


Subject(s)
COVID-19 , Data Science/organization & administration , Information Dissemination , Intersectoral Collaboration , Computer Security , Data Analysis , Ethics Committees, Research , Government Regulation , Humans , National Institutes of Health (U.S.) , United States